PINTA: a web server for network-based gene prioritization from expression data

Author: Arner
Baldi
Chen
D. Nitsch
Franke
Hristovski
Hutz
Irizarry
J. K. Vogt
J. P. Goncalves
Kohler
L.-C. Tranchevent
Nitsch
Radivojac
S. C. Madeira
Seelow
Y. Moreau
Yu
Publication venue: Oxford University Press
Publication date: 01/01/2011

PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein–protein interaction network. Our strategy is meant for biological and medical researchers aiming to identify novel disease genes using disease-specific expression data. PINTA supports both candidate gene prioritization (starting from a user-defined set of candidate genes) and genome-wide gene prioritization, and is available for five species (human, mouse, rat, worm and yeast). As input, PINTA requires only disease-specific expression data, and various platforms (e.g. Affymetrix) are supported. As output, PINTA computes a gene ranking and presents the results as a table that the user can easily browse and download.

Online Research Database In Technology

ReLiance: a machine learning and literature-based prioritization of receptor-ligand pairings.

Author: De Moor B.
Iacucci Ernesto
Moreau Y.
Pavlopoulos Georgios A.
Popovic D.
Schneider Reinhard
Tranchevent L. C.
Publication date: 01/01/2012

Motivation: The prediction of receptor-ligand pairings is an important area of research, as intercellular communications are mediated by the successful interaction of these key proteins. As the exhaustive assaying of receptor-ligand pairs is impractical, a computational approach to predict pairings is necessary. We propose a workflow to carry out this interaction prediction task, using a text mining approach in conjunction with a state-of-the-art prediction method, as well as a widely accessible and comprehensive dataset. Among several modern classifiers, random forests were found to be the best at this prediction task. The classifier was trained on an experimentally validated dataset of receptor-ligand pairs from the Database of Ligand-Receptor Partners (DLRP). New examples, co-cited with the training receptors and ligands, are then classified using the trained classifier. After applying our method, we find that we are able to successfully predict receptor-ligand pairs within the GPCR family with a balanced accuracy of 0.96. Upon further inspection, we find several supported interactions that were not present in the Database of Interacting Proteins (DIP). We have measured the balanced accuracy of our method, and the resulting high-quality predictions are stored in the available ReLiance database. Availability: http://homes.esat.kuleuven.be/?bioiuser/ReLianceDB/index.php Contact: [email protected]; ernesto.iacucci@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
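The abstract above reports performance as balanced accuracy. As a reminder of what that metric measures, here is a minimal sketch using the standard definition (the mean of sensitivity and specificity). The function name and the confusion-matrix counts are illustrative assumptions and are not taken from the paper.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)  # fraction of true pairs that are recovered
    specificity = tn / (tn + fp)  # fraction of non-pairs that are rejected
    return (sensitivity + specificity) / 2


# Hypothetical counts for an imbalanced test set (not from the paper).
score = balanced_accuracy(tp=9, fn=1, tn=50, fp=50)
print(f"balanced accuracy: {score:.2f}")
```

Unlike plain accuracy, this value is not inflated when one class (for example, non-interacting pairs) dominates the test set.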

Open Repository and Bibliography - Luxembourg

Integration of multiple data sources to prioritize candidate genes using discounted rating system

Author: A Hamosh
C Perez-Iratxeta
C Stark
D Botstein
D Lin
EA Adie
F Turner
J Xu
Jagdish C Patra
JJ Jiang
JM Stuart
K Järvelin
L Lovasz
LC Tranchevent
M Mistry
MA Harris
MG Anne
N López-Bigas
P Resnik
S Aerts
S Kohler
S Peri
T De Bie
Y Li
Yongjin Li
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Identifying disease genes from a list of candidate genes is an important task in bioinformatics. The main strategy is to prioritize candidate genes based on their similarity to known disease genes. Most existing gene prioritization methods access only one genomic data source, which is noisy and incomplete. Thus, there is a need for the integration of multiple data sources containing different information. Results: In this paper, we propose a combination strategy called the discounted rating system (DRS). We performed leave-one-out cross-validation to compare it with the N-dimensional order statistics (NDOS) used in Endeavour. Results showed that the AUC (Area Under the Curve) values achieved by DRS were comparable with those of NDOS on most of the disease families, but DRS ran much faster than NDOS, especially as the number of data sources increased. With 100 candidate genes and 20 data sources, DRS runs more than 180 times faster than NDOS. Within the DRS framework, we assign different weights to different data sources. The weighted DRS achieved significantly higher AUC values than NDOS. Conclusion: The proposed DRS algorithm is a powerful and effective framework for candidate gene prioritization. If the weights of the different data sources are chosen properly, the DRS algorithm will perform better.

Springer - Publisher Connector

DR-NTU (Digital Repository of NTU)

Swinburne Research Bank

Candidate gene prioritization by network analysis of differential expression using machine learning approaches

Author: A Subramanian
A Zanzoni
AJ Smola
AP Francisco
B Aranda
B Harr
Bart de Moor
C Saunders
C Stark
C von Mering
D Nitsch
D Zieker
Daniela Nitsch
F Chung
F Fouss
Fabian Ojeda
GC Cawley
GD Bader
H Yang
HY Chuang
J Chen
JA Hanley
Joana P Gonçalves
JW Park
K Lage
KR Brown
L Franke
L Gautier
L Salwinski
LC Tranchevent
M Liu
P Baldi
P Pagel
R Gupta
RA Irizarry
RI Kondor
RK Nibbe
S Aerts
S Köhler
S Mirkin
S Razick
S Vardhanabhuti
SE Choe
T Fawcett
WK Lim
Y Saad
Yves Moreau
Z Wu
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Discovering novel disease genes is still challenging for diseases for which no prior knowledge (such as known disease genes or disease-related pathways) is available. Performing genetic studies frequently results in large lists of candidate genes, of which only a few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge with experimental data on differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network. Results: We have proposed three strategies for scoring disease candidate genes that rely on network-based machine learning approaches: kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking, leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure, which ranked the knockout gene on average at position 17 with an AUC value of 83.7%. Conclusion: In this study we could identify promising candidate genes using network-based machine learning approaches even if no knowledge is available about the disease or phenotype.
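The reported error reduction can be reproduced from the two AUC values quoted in the abstract, assuming that "error" means 100% minus the AUC. That definition is an assumption here (the abstract does not spell it out), but it matches the published figure. A minimal sketch, with a hypothetical function name:

```python
def error_reduction(auc_baseline: float, auc_method: float) -> float:
    """Relative reduction of the ranking error, taking error = 100 - AUC (in percent)."""
    error_baseline = 100.0 - auc_baseline  # error of the standard procedure
    error_method = 100.0 - auc_method      # error of the improved method
    return 100.0 * (error_baseline - error_method) / error_baseline


# AUC values quoted in the abstract: 83.7% (baseline) and 92.3% (heat kernel ranking).
print(round(error_reduction(83.7, 92.3), 1))  # 52.8
```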

Springer - Publisher Connector

arXiv.org e-Print Archive

ProDiGe: Prioritization Of Disease Genes with multitask machine learning from positive and unlabeled examples

Author: A Su
B Brancotte
B Calvo
B Linghu
B Liu
B Schölkopf
C Giallourakis
C Perez-Iratxeta
C Son
CC Chang
EA Adie
F Denis
F Mordelet
Fantine Mordelet
FS Turner
G Lanckriet
GRG Lanckriet
J Freudenberg
Jean-Philippe Vert
K Bleakley
K Lage
L Jacob
LC Tranchevent
M van Driel
N López-Bigas
N Tiffin
O Vanunu
P Pavlidis
RI Kondor
S Aerts
S Köhler
S Yu
T De Bie
T Evgeniou
T Hwang
U Ala
V McKusick
X Wu
Y Yamanishi
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Elucidating the genetic basis of human diseases is a central goal of genetics and molecular biology. While traditional linkage analysis and modern high-throughput techniques often provide long lists of tens or hundreds of disease gene candidates, the identification of disease genes among the candidates remains time-consuming and expensive. Efficient computational methods are therefore needed to prioritize genes within the list of candidates by exploiting the wealth of information available about the genes in various databases. Results: We propose ProDiGe, a novel algorithm for Prioritization of Disease Genes. ProDiGe implements a novel machine learning strategy based on learning from positive and unlabeled examples, which makes it possible to integrate various sources of information about the genes, to share information about known disease genes across diseases, and to perform genome-wide searches for new disease genes. Experiments on real data show that ProDiGe outperforms state-of-the-art methods for the prioritization of genes in human diseases. Conclusions: ProDiGe implements a new machine learning paradigm for gene prioritization, which could help the identification of new disease genes. It is freely available at http://cbio.ensmp.fr/prodige.

Springer - Publisher Connector

Endeavour update: a web resource for gene prioritization in multiple species

Author: Adie
Aerts
Ashburner
B. Coessens
B. De Moor
Bader
Ebermann
Elbers
Gasteiger
Glenisson
Hamosh
Hovatta
Hristovski
Jimenez-Sanchez
L.-C. Tranchevent
Lopez-Bigas
Mewes
Mulder
Oti
P. Van Loo
Perez-Iratxeta
Peri
R. Barriot
Rossi
S. Aerts
S. Van Vooren
S. Yu
Salwinski
Smith
Son
Stark
Tiffin
Turner
van Driel
Walker
Xia
Y. Moreau
Ye
Zhu
Publication venue: Oxford University Press

Endeavour (http://www.esat.kuleuven.be/endeavourweb; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes. Using a training set of genes known to be involved in a biological process of interest, our approach consists of (i) inferring several models (based on various genomic data sources), (ii) applying each model to the candidate genes to rank those candidates against the profile of the known genes and (iii) merging the several rankings into a global ranking of the candidate genes. In the present article, we describe the latest developments of Endeavour. First, we provide a web-based user interface, in addition to our Java client, to make Endeavour more universally accessible. Second, we support multiple species: in addition to Homo sapiens, we now provide gene prioritization for three major model organisms: Mus musculus, Rattus norvegicus and Caenorhabditis elegans. Third, Endeavour makes use of additional data sources and now includes numerous databases: ontologies and annotations, protein–protein interactions, cis-regulatory information, gene expression data sets, sequence information and text-mining data. We tested the novel version of Endeavour on 32 recent disease gene associations from the literature. Additionally, we describe a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis.
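Steps (ii) and (iii) of the workflow described above (one ranking per data source, then a merged global ranking) can be illustrated with a small sketch. Note the swapped-in technique: this example merges rankings by averaging rank ratios, which is a deliberately simple stand-in and not Endeavour's actual fusion method. The gene names, scores, and function names are hypothetical.

```python
from statistics import mean


def rank_ratios(scores: dict[str, float]) -> dict[str, float]:
    """Map each gene to rank / number of genes; the highest score gets the smallest ratio."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {gene: (position + 1) / len(ordered) for position, gene in enumerate(ordered)}


def merge_rankings(per_source_scores: list[dict[str, float]]) -> list[str]:
    """Merge per-source rankings into one global ranking (best candidate first).

    Assumption: every source scores the same set of candidate genes.
    """
    per_source = [rank_ratios(scores) for scores in per_source_scores]
    genes = list(per_source[0])
    merged = {gene: mean(ratios[gene] for ratios in per_source) for gene in genes}
    return sorted(genes, key=merged.get)


# Hypothetical similarity scores of three candidates under three data sources.
sources = [
    {"GENE_A": 0.9, "GENE_B": 0.5, "GENE_C": 0.1},
    {"GENE_A": 0.8, "GENE_B": 0.2, "GENE_C": 0.4},
    {"GENE_A": 0.6, "GENE_B": 0.7, "GENE_C": 0.3},
]
print(merge_rankings(sources))
```

Working with rank ratios rather than raw scores keeps sources with different score scales comparable before they are combined.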

Public Library of Science (PLOS)

Network Analysis of Differential Expression for the Identification of Disease-Causing Genes

Author: AM Yip
Bernard Thienpont
C Moehle
C von Mering
Daniela Nitsch
DB Mount
DN Cox
EH Rosenberg
EK Malmberg
F Fouss
FJ Probst
FR Bach
Gustavo Goldman
H Parkinson
Hilde Van Esch
HY Chuang
J Johnson
JM Wright
JR Riordan
K Kyo
K Lage
Koenraad Devriendt
L Bubendorf
L Franke
Lieven Thorrez
Léon-Charles Tranchevent
M Bakay
M Cortón
M Simoni
M Urbanek
MR Jones
N Kotaja
P Moretti
PE Becker
RI Kondor
S Aerts
S Draghici
S Fine
S Franks
S Ina
S Kuramochi-Miyagawa
S Köhler
SS Tanaka
T Barrett
T Noce
T Watanabe
TK Gandhi
Y Nishimura
Yves Moreau
Z Yao
Publication venue: Public Library of Science
Publication date: 01/05/2009

Genetic studies (in particular linkage and association studies) identify chromosomal regions involved in a disease or phenotype of interest, but those regions often contain many candidate genes, only a few of which can be followed up for biological validation. Recently, computational methods to identify (prioritize) the most promising candidates within a region have been proposed, but they are usually not applicable to cases where little is known about the phenotype (no or few confirmed disease genes, fragmentary understanding of the biological cascades involved). We seek to overcome this limitation by replacing knowledge about the biological process with experimental data on differential gene expression between affected and healthy individuals. Considering the problem from the perspective of a gene/protein network, we assess a candidate gene by considering the level of differential expression in its neighborhood, under the assumption that strong candidates will tend to be surrounded by differentially expressed neighbors. We define a notion of soft neighborhood where each gene is given a contributing weight, which decreases with the distance from the candidate gene on the protein network. To account for multiple paths between genes, we define the distance using the Laplacian exponential diffusion kernel. We score candidates by aggregating the differential expression of neighbors weighted as a function of distance. Through a randomization procedure, we rank candidates by p-values. We illustrate our approach on four monogenic diseases and successfully prioritize the known disease-causing genes.
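The scoring idea in the abstract above (soft neighborhood weights from a Laplacian exponential diffusion kernel, aggregation of neighbors' differential expression, p-values from randomization) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the unnormalized Laplacian, the diffusion parameter `beta`, the use of absolute expression values, the exclusion of the candidate's own expression, and the permutation scheme are all assumptions, and every function name is hypothetical.

```python
import numpy as np


def diffusion_kernel(adjacency: np.ndarray, beta: float = 0.5) -> np.ndarray:
    """Laplacian exponential diffusion kernel K = exp(-beta * L), with L = D - A."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    # L is symmetric, so the matrix exponential follows from its eigendecomposition.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return (eigenvectors * np.exp(-beta * eigenvalues)) @ eigenvectors.T


def neighborhood_scores(kernel: np.ndarray, diff_expr: np.ndarray) -> np.ndarray:
    """Kernel-weighted sum of the neighbors' absolute differential expression."""
    weights = kernel.copy()
    np.fill_diagonal(weights, 0.0)  # assumption: ignore the candidate's own expression
    return weights @ np.abs(diff_expr)


def permutation_p_values(kernel, diff_expr, n_perm=1000, seed=0):
    """Empirical p-values obtained by shuffling expression values over the network."""
    rng = np.random.default_rng(seed)
    observed = neighborhood_scores(kernel, diff_expr)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        shuffled = rng.permutation(diff_expr)
        exceed += neighborhood_scores(kernel, shuffled) >= observed
    return (exceed + 1) / (n_perm + 1)


# Toy path network 0-1-2-3-4 with one strongly differentially expressed gene (gene 1).
adjacency = np.zeros((5, 5))
for i in range(4):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
expression = np.array([0.0, 3.0, 0.0, 0.0, 0.0])
kernel = diffusion_kernel(adjacency)
print(neighborhood_scores(kernel, expression))
print(permutation_p_values(kernel, expression, n_perm=200))
```

Because the kernel sums contributions over all paths between two genes, a candidate connected to a differentially expressed gene through several routes receives more weight than one connected by a single path of the same length.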

Lirias

iPSC-Derived Microglia as a Model to Study Inflammation in Idiopathic Parkinson's Disease.

Author: Anne Grünewald
Carmen Venegas
Enrico Glaab
Jens C. Schwamborn
Katja Badanjak
Leon-Charles Tranchevent
Nico Diederich
Patrycja Mulica
Paul M. A. Antony
Sally A. Cowley
Sandro L. Pereira
Semra Smajic
Sylvie Delcambre
Thomas Rauen
Publication date: 01/01/2021

Parkinson's disease (PD) is a neurodegenerative disease with unknown cause in the majority of patients, who are therefore considered "idiopathic" (IPD). PD predominantly affects dopaminergic neurons in the substantia nigra pars compacta (SNpc), yet the pathology is not limited to this cell type. Advancing age is considered the main risk factor for the development of IPD and greatly influences the function of microglia, the immune cells of the brain. With increasing age, microglia become dysfunctional and release pro-inflammatory factors into the extracellular space, which promote neuronal cell death. Accordingly, neuroinflammation has also been described as a feature of PD. So far, studies exploring inflammatory pathways in IPD patient samples have primarily focused on blood-derived immune cells or brain sections, but rarely investigated patient microglia in vitro. We therefore decided to explore the contribution of microglia to IPD in a comparative manner using both iPSC-derived cultures and postmortem tissue. Our meta-analysis of published RNAseq datasets indicated an upregulation of IL10 and IL1B in nigral tissue from IPD patients. We observed increased expression levels of these cytokines in microglia compared to neurons using our single-cell midbrain atlas. Moreover, IL10 and IL1B were upregulated in IPD compared to control microglia. Next, to validate these findings in vitro, we generated IPD patient microglia from iPSCs using an established differentiation protocol. IPD microglia were more readily primed, as indicated by elevated IL1B and IL10 gene expression and higher mRNA and protein levels of NLRP3 after LPS treatment. In addition, IPD microglia had higher phagocytic capacity under basal conditions, a phenotype that was further exacerbated upon stimulation with LPS, suggesting an aberrant microglial function. Our results demonstrate the significance of microglia as the key player in the neuroinflammation process in IPD. While our study highlights the importance of microglia-mediated inflammatory signaling in IPD, further investigations will be needed to explore particular disease mechanisms in these cells.

Oxford University Research Archive

Open Repository and Bibliography - Luxembourg

TargetMine, an Integrated Data Warehouse for Candidate Gene Prioritisation and Target Discovery

Author: A Bairoch
A Birkland
A Burgun
A Garcia Castro
A Joachimiak
A Kasprzyk
AG Murzin
C Linhart
C Perez-Iratxeta
C Stark
C Yamasaki
CJ van Rijsbergen
D Cheng
D Nitsch
D Seelow
DS Latchman
DS Wishart
DW Nebert
EA Adie
G Hripcsak
H Ge
HM Berman
J Chen
J Shi
JD Osborne
JD Watson
JE Hutz
JP Helfrich
JP Overington
K Lage
Kenji Mizuguchi
KF Aoki-Kinoshita
L Wong
LD Stein
Lo-C Tranchevent
Lokesh P. Tripathi
LP Tripathi
LS Chen
M Ashburner
M Cornell
M Gerstein
OL Griffith
PJ Kersey
R Lyne
S Aerts
S Ahmad
S Hunter
S Kohler
S Velankar
SP Shah
TJ Lee
Vladimir Uversky
WS Noble
X Chen
Y Murakami
Y Yang
Yi-An Chen
Publication venue: Public Library of Science
Publication date: 01/01/2011

Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/

CiteSeerX

Public Library of Science (PLOS)